Lexical Annotation for Multi-word Entries Containing Nominalizations
Abstract
New York University has produced a dictionary of nominalizations (NOMLEX) whose entries capture the relationship between a nominalization and its associated verb. The dictionary indicates where the verbal arguments may be found in the noun phrase that contains the nominalization. We have now studied nominalizations and their co-occurring verbs and produced entries for some of these pairs. These entries are much more complex than NOMLEX entries. To express all the relationships between the nominalization and its co-occurring verb, we used the terminology of Igor Mel’čuk, whose theories have been used to create dictionaries in French and Russian. His categories proved very useful for this task. The verb + nominalization pairs were selected by frequency of co-occurrence and thus do not strictly conform to what are considered support verbs. Support verbs are generally defined as having no semantic content, serving only to carry tense and number, which the nominalization cannot express. A typical example is “commit a murder”. The paper describes the NOMLEX entry that is the basis of this work and then demonstrates the additional information needed to describe the verb + nominalization pair.
Similar resources
A Chinese Corpus with Word Sense Annotation
This paper presents the construction of a Chinese word sense-tagged corpus. The resulting lexical resource includes mainly three components: 1) a corpus annotated with word senses; 2) a lexicon containing sense distinction and description in the feature-based formalism; 3) the linking between the sense entries in the lexicon and CCD synsets. A dynamic model is put forward to build the three kno...
Approximating the disambiguation of some German nominalizations by use of weak structural, lexical and corpus information / Hacía la desambiguación de nominalizaciones en alemán a partir de información estructural, léxica y de corpus
Between classical symbolic word sense disambiguation (wsd) using explicit deep semantic representations of sentences and texts and statistical wsd using word co-occurrence information, there is a recent tendency towards mediating methods. Similar to so-called lightweight semantics (Marek, 2009) we suggest to only make sparse use of semantic information. We describe an approximation model based ...
On multiword lexical units and their role in maritime dictionaries
Multi-word lexical units are a typical feature of specialized dictionaries, in particular monolingual and bilingual maritime dictionaries. The paper studies the concept of the multi-word lexical unit and considers the similarities and differences of their selection and presentation in monolingual and bilingual maritime dictionaries. The work analyses such issues as the classification of multi-w...
AnCora-Nom: A Spanish Lexicon of Deverbal Nominalizations
This paper describes a new lexical resource: AnCora-Nom, a Spanish lexicon of deverbal nominalizations. At present, it contains 1,655 lexical entries and 3,094 senses. Each sense has an associated denotation type, and the mapping of nominal complements to arguments and their corresponding theta roles is also annotated. A particular interest of this lexicon is that it has been automatically extra...
Towards Best Practice for Multiword Expressions in Computational Lexicons
The importance and role of multi-word expressions (MWE) in the description and processing of natural language has been long recognized. However, multi-word information has often been relegated to the marginal role of idiosyncratic lexical information. The need for MWE lexicons grows even more acute for multi-lingual applications, for which (sometimes complex) correspondences must be identified,...